Cutting the Long Tail: Hybrid Language Models for Translation Style Adaptation

Authors

  • Arianna Bisazza
  • Marcello Federico
Abstract

In this paper, we address statistical machine translation of public conference talks. Modeling the style of this genre can be very challenging given the shortage of available in-domain training data. We investigate the use of a hybrid LM, in which infrequent words are mapped into classes. Hybrid LMs are used to complement word-based LMs with statistics about the language style of the talks. Extensive experiments comparing different settings of the hybrid LM are reported on publicly available benchmarks based on TED talks, from Arabic to English and from English to French. The proposed models are shown to exploit in-domain data better than conventional word-based LMs for the target language modeling component of a phrase-based statistical machine translation system.
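
The word-to-class mapping mentioned in the abstract can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the frequency threshold `min_count`, the function names, and the suffix-based `toy_class` function are all hypothetical, and the abstract does not say which word classes or threshold are used.

```python
from collections import Counter
from typing import Callable, Iterable, List, Set


def build_frequent_vocab(corpus: Iterable[List[str]], min_count: int) -> Set[str]:
    """Return the words that occur at least `min_count` times in the corpus."""
    counts = Counter(tok for sent in corpus for tok in sent)
    return {w for w, c in counts.items() if c >= min_count}


def toy_class(word: str) -> str:
    """Hypothetical stand-in for a real word class (e.g. a tagger's output)."""
    if word.isdigit():
        return "<NUM>"
    if word.endswith("ing"):
        return "<ING>"
    if word.endswith("ly"):
        return "<LY>"
    return "<OTHER>"


def to_hybrid(
    sentence: List[str],
    frequent: Set[str],
    word_class: Callable[[str], str] = toy_class,
) -> List[str]:
    """Keep frequent words as they are; replace infrequent words by their class."""
    return [tok if tok in frequent else word_class(tok) for tok in sentence]


# Tiny made-up corpus, for illustration only.
corpus = [
    "we are going to talk about ideas".split(),
    "we are really going to change the world".split(),
    "ideas are worth spreading".split(),
]

frequent = build_frequent_vocab(corpus, min_count=2)
hybrid_corpus = [to_hybrid(sent, frequent) for sent in corpus]

for sent in hybrid_corpus:
    print(" ".join(sent))
```

The resulting mixed stream of words and class tokens could then be passed to any n-gram LM toolkit in place of the plain word stream, so that the model's statistics are not scattered over rare words.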

Similar resources

Learning Style Preferences in Male and Female Professional Translators

This study investigated learning style preferences among professional translators. The purposes of the study were to (a) find the prevailing learning style among the Iranian professional translators; (b) reveal any significant difference in the translators’ learning style preferences in terms of gender; and (c) find any significant difference between individual learning style and translation co...

A Stylistic Approach to Translation: Figurative language devices in the Persian renderings of Alcott's Little Women

The present study aimed firstly at investigating the impact of translators' style on figurative language translation from English into Persian. Secondly, it intended to find which strategies were most frequently adopted by Persian translators to translate figures of speech into Persian. Lastly, the study sought to check the extent of transference of figurative features of literary texts in Engl...

The Intersemiotic Study of Translation from Page to Stage: The Farsi Translation of Macbeth for Stage Adaptation from the Perspective of Peirceʼs Model

Intersemiotic translation, which can happen in the process of the translation of drama for theatre, can turn more complicated when the verbal sign system of drama has already undergone interlingual translation. The purpose of this study is to find the intersemiotic changes of translation from page to stage and to show the changes of indexical, iconic, and symbolic signs in the process of inters...

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...

Publication date: 2012